-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter regex patterns in addition to prefixes #59
Conversation
This breaks any code currently using this functionality. Maybe the existing API can be kept and the pattern filter introduced beside it? Updating the rmw implementations to always provide patterns seems fine since that won't affect users (except those implementing their own rmw impl. with custom filters). |
If every prefix can be expressed as a regex, I think it's better for us to drop support for the prefix option so that we don't have to maintain it and just update our usage to the new API (no one else on github uses this function) If, however, we really like the simplicity of the prefix option, we could keep support for it, but that's a different reason |
@dirk-thomas are you in favour of keeping support for the prefix filtering, or was it just an idea you were floating? If so, I can change it, it's easy to do, I'd just like to understand the reasoning (since I think you're usually in favour of maintaining as few options as possible) |
Yes, I think we should keep supporting prefix filtering. While every prefix can be expressed as a regex it goes imo against the design principle that a user shouldn't "pay" for a feature he isn't using.
If multiple options achieve the same think I am certainly in favor reducing them to the minimal amount. In this case I see the two options as functionality different though: the prefix filtering is the "best" approach if that is sufficient for the use case, the regex allows more complicated filters but come with the "price" of additional overhead. |
ok, cool. originally it seemed because of api breakage. this makes sense to me thanks, will update |
b617866
to
eb33c9f
Compare
This PR has been updated to add CI with the attached PR updating the connext license output ros2/rmw_connext#242: |
|
||
if filtered_rmw_implementation: | ||
rmw_output_filter = get_rmw_output_filter(filtered_rmw_implementation) | ||
self.filtered_prefixes.extend(rmw_output_filter) | ||
self.filtered_patterns.extend(rmw_output_filter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should the rmw filters need to be patterns? If a prefix filter is sufficient it would be good if an rmw impl. could use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understood from your comment above that, while we want users to give users the choice between patterns or prefixes, it's not necessary for "internal" rmw filters:
Updating the rmw implementations to always provide patterns seems fine since that won't affect users (except those implementing their own rmw impl. with custom filters).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was 20 days ago - a lot has happened since then 😉
Sure, we could make all rmw impl. use regex since we kind of maintain most of them. That was the main rational in the first review. But my latest comment was about efficiency - why use a regex if a prefix does the job just fine? I haven't spend any thought on how this could be done. An rmw impl. would need to somehow declare the type of their filters - either by using a different resource key or embedding the type information in the resource. The first seems to be easier.
@@ -70,6 +77,9 @@ def on_stdout_lines(self, lines): | |||
|
|||
for line in lines.splitlines(): | |||
# Filter out stdout that comes from underlying DDS implementation | |||
# Note: we do not currently support matching filters across multiple stdout lines. | |||
if any(re.fullmatch(pattern, line) for pattern in self.filtered_patterns): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a fullmatch
? Could it just use match
instead and leave it up to the regex to use ^
and $
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was just trying to avoid regexes from matching accidentally, but I agree that that should be the choice of the user.
If we just use match
, which only matches at the start of a line, then actually the prefix option could then be collapsed back into the pattern option... in which case we can just document that filters can be literal strings of prefixes for those users just wanting the simplicity of specifying prefixes. How's that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you can determine if the filter is supposed to be a string or a regular expression. E.g. if it contains a .
you can't guess the semantic. Therefore I think you need to know the type of the filter (prefix string vs. regex) in order to invoke the corresponding logic.
Again, I think we should never use a regular expression match when a string match is sufficient simply because of the unjustified overhead.
OK this PR now extends both the API and the rmw-specified filters to accept both prefixes and regexes. ros2/rmw_connext#242 has been updated to clarify it's registering a prefix to filter but otherwise no change to the codebase. CI with this PR and ros2/rmw_connext#242: |
if any(line.startswith(prefix) for prefix in self.filtered_prefixes): | ||
continue | ||
if any(re.match(pattern, line) for pattern in self.filtered_patterns): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was first worried that using the same pattern without compiling it repeatedly would be slower. But it turns out modern Python already do a great job compiling the pattern and caching it for future use. So all good here 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I should have mentioned that I checked the same thing, to avoid you looking into it. it might be worthwhile if there are a lot of unique regexes, but for now we don't have that case
thanks for the nitpick commits |
this will allow us to filter out lines based on regexes instead of just prefixes. note that even if you specify a multiline regex it will only ever be matched against one line of console output at a time (this was also true how we were matching the prefixes before)
this was motivated by wanting to filter things like the fastrtps error message in
http://ci.ros2.org/view/nightly/job/nightly_osx_repeated/772/testReport/(root)/projectroot/test_tutorial_add_two_ints_server_add_two_ints_client_async__rmw_fastrtps_cpp/
where prefix matching wouldn't be sufficient. Even though we probably don't want to filter that error message in general, I found this useful while investigating flaky tests. Using this I could run tests N times ignoring this error, preventing unrelated test failures from its output.
CI including relevant changes in dependencies (for which I'll open PRs if this change looks OK conceptually)